AITopics

Country:

South America > Brazil (0.14)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Poland (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Neural Information Processing SystemsFeb-9-2026, 07:25:26 GMT

5d52b102ebd672023628cac20e9da5ff-Paper-Conference.pdf

accuracy, sparsity, subnetwork, (15 more...)

Country:

Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Contests & Prizes (0.50)

Industry: Leisure & Entertainment > Gambling (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Neural Information Processing SystemsDec-25-2025, 03:30:50 GMT

Patching open-vocabulary models by interpolating weights

Open-vocabulary models like CLIP achieve high accuracy across many image classification tasks. However, there are still settings where their zero-shot performance is far from optimal. We study model patching, where the goal is to improve accuracy on specific tasks without degrading accuracy on tasks where performance is already adequate. Towards this goal, we introduce PAINT, a patching method that uses interpolations between the weights of a model before fine-tuning and the weights after fine-tuning on a task to be patched. On nine tasks where zero-shot CLIP performs poorly, PAINT increases accuracy by 15 to 60 percentage points while preserving accuracy on ImageNet within one percentage point of the zero-shot model. PAINT also allows a single model to be patched on multiple tasks and improves with model scale. Furthermore, we identify cases of broad transfer, where patching on one task increases accuracy on other tasks even when the tasks have disjoint classes. Finally, we investigate applications beyond common benchmarks such as counting or reducing the impact of typographic attacks on CLIP. Our findings demonstrate that it is possible to expand the set of tasks on which open-vocabulary models achieve high accuracy without re-training them from scratch.

accuracy, name change, patching open-vocabulary model, (3 more...)

Genre: Research Report > New Finding (0.60)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

Bahloul, Ahmed, Malberg, Simon

From Roots to Rewards: Dynamic Tree Reasoning with Reinforcement Learning

arXiv.org Artificial IntelligenceSep-29-2025

Modern language models address complex questions through chain-of-thought (CoT) reasoning (Wei et al., 2023) and retrieval augmentation (Lewis et al., 2021), yet struggle with error propagation and knowledge integration. Tree-structured reasoning methods, particularly the Probabilistic Tree-of-Thought (ProbTree)(Cao et al., 2023) framework, mitigate these issues by decomposing questions into hierarchical structures and selecting answers through confidence-weighted aggregation of parametric and retrieved knowledge (Yao et al., 2023). However, ProbTree's static implementation introduces two key limitations: (1) the reasoning tree is fixed during the initial construction phase, preventing dynamic adaptation to intermediate results, and (2) each node requires exhaustive evaluation of all possible solution strategies, creating computational inefficiency. We present a dynamic reinforcement learning (Sutton and Barto, 2018) framework that transforms tree-based reasoning into an adaptive process. Our approach incrementally constructs the reasoning tree based on real-time confidence estimates, while learning optimal policies for action selection (decomposition, retrieval, or aggregation). This maintains ProbTree's probabilistic rigor while improving both solution quality and computational efficiency through selective expansion and focused resource allocation. The work establishes a new paradigm for treestructured reasoning that balances the reliability of probabilistic frameworks with the flexibility required for real-world question answering systems. Code available at: https://github.com/ahmedehabb/From-Roots-to-Rewards-Dynamic-Tree-Reasoning-with-RL

large language model, machine learning, reinforcement learning, (21 more...)

2507.13142

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Cho, Younggeol, Solak, Gokhan, Nocentini, Olivia, Lorenzini, Marta, Fortuna, Andrea, Ajoudani, Arash

Anticipatory Fall Detection in Humans with Hybrid Directed Graph Neural Networks and Long Short-Term Memory

arXiv.org Artificial IntelligenceSep-9-2025

Detecting and preventing falls in humans is a critical component of assistive robotic systems. While significant progress has been made in detecting falls, the prediction of falls before they happen, and analysis of the transient state between stability and an impending fall remain unexplored. In this paper, we propose a anticipatory fall detection method that utilizes a hybrid model combining Dynamic Graph Neural Networks (DGNN) with Long Short-Term Memory (LSTM) networks that decoupled the motion prediction and gait classification tasks to anticipate falls with high accuracy. Our approach employs real-time skeletal features extracted from video sequences as input for the proposed model. The DGNN acts as a classifier, distinguishing between three gait states: stable, transient, and fall. The LSTM-based network then predicts human movement in subsequent time steps, enabling early detection of falls. The proposed model was trained and validated using the OUMVLP-Pose and URFD datasets, demonstrating superior performance in terms of prediction error and recognition accuracy compared to models relying solely on DGNN and models from literature. The results indicate that decoupling prediction and classification improves performance compared to addressing the unified problem using only the DGNN. Furthermore, our method allows for the monitoring of the transient state, offering valuable insights that could enhance the functionality of advanced assistance systems.

artificial intelligence, keypoint, machine learning, (17 more...)

2509.05337

Country: Europe (0.68)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsAug-18-2025, 08:31:12 GMT

Patching open-vocabulary models by interpolating weights Gabriel Ilharco 1 Mitchell Wortsman

However, there are still settings where their zero-shot performance is far from optimal.

accuracy, machine learning, natural language, (14 more...)

Country:

South America > Brazil (0.14)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Poland (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Neural Information Processing SystemsAug-15-2025, 04:04:01 GMT

5d52b102ebd672023628cac20e9da5ff-Paper-Conference.pdf

accuracy, sparsity, subnetwork, (14 more...)

Country:

Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Contests & Prizes (0.50)

Industry: Leisure & Entertainment > Gambling (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Putra, Rachmad Vidya Wicaksana, Iftikhar, Saad, Shafique, Muhammad

QSViT: A Methodology for Quantizing Spiking Vision Transformers

arXiv.org Artificial IntelligenceApr-1-2025

Vision Transformer (ViT)-based models have shown state-of-the-art performance (e.g., accuracy) in vision-based AI tasks. However, realizing their capability in resource-constrained embedded AI systems is challenging due to their inherent large memory footprints and complex computations, thereby incurring high power/energy consumption. Recently, Spiking Vision Transformer (SViT)-based models have emerged as alternate low-power ViT networks. However, their large memory footprints still hinder their applicability for resource-constrained embedded AI systems. Therefore, there is a need for a methodology to compress SViT models without degrading the accuracy significantly. To address this, we propose QSViT, a novel design methodology to compress the SViT models through a systematic quantization strategy across different network layers. To do this, our QSViT employs several key steps: (1) investigating the impact of different precision levels in different network layers, (2) identifying the appropriate base quantization settings for guiding bit precision reduction, (3) performing a guided quantization strategy based on the base settings to select the appropriate quantization setting, and (4) developing an efficient quantized network based on the selected quantization setting. The experimental results demonstrate that, our QSViT methodology achieves 22.75% memory saving and 21.33% power saving, while also maintaining high accuracy within 2.1% from that of the original non-quantized SViT model on the ImageNet dataset. These results highlight the potential of QSViT methodology to pave the way toward the efficient SViT deployments on resource-constrained embedded AI systems.

artificial intelligence, machine learning, quantization, (17 more...)

2504.00948

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Europe > Italy > Lazio > Rome (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)

Neural Information Processing SystemsJan-18-2025, 17:39:46 GMT

Patching open-vocabulary models by interpolating weights

accuracy, increase accuracy, patching open-vocabulary model, (1 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.77)
Information Technology > Artificial Intelligence > Machine Learning (0.42)

arXiv.org Artificial IntelligenceDec-18-2024

MagicPIG: LSH Sampling for Efficient LLM Generation

Chen, Zhuoming, Sadhukhan, Ranajoy, Ye, Zihao, Zhou, Yang, Zhang, Jianyu, Nolte, Niklas, Tian, Yuandong, Douze, Matthijs, Bottou, Leon, Jia, Zhihao, Chen, Beidi

Large language models (LLMs) with long context windows have gained significant attention. However, the KV cache, stored to avoid re-computation, becomes a bottleneck. Various dynamic sparse or TopK-based attention approximation methods have been proposed to leverage the common insight that attention is sparse. In this paper, we first show that TopK attention itself suffers from quality degradation in certain downstream tasks because attention is not always as sparse as expected. Rather than selecting the keys and values with the highest attention scores, sampling with theoretical guarantees can provide a better estimation for attention output. To make the sampling-based approximation practical in LLM generation, we propose MagicPIG, a heterogeneous system based on Locality Sensitive Hashing (LSH). MagicPIG significantly reduces the workload of attention computation while preserving high accuracy for diverse tasks. MagicPIG stores the LSH hash tables and runs the attention computation on the CPU, which allows it to serve longer contexts and larger batch sizes with high approximation accuracy. MagicPIG can improve decoding throughput by up to $5\times$ across various GPU hardware and achieve 54ms decoding latency on a single RTX 4090 for Llama-3.1-8B-Instruct model with a context of 96k tokens. The code is available at https://github.com/Infini-AI-Lab/MagicPIG.

large language model, machine learning, magicpig, (20 more...)

2410.16179

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)